Dynamic Chord Analysis
Abstract
In this paper, we present a new method for chord recognition from audio. The method builds a graph of all possible chords and selects the best path through this graph. A rule-based approach enumerates chord candidates from groups of notes by considering compatible chords and compatible keys. The distance proposed by Lerdahl is then used to compute costs between chord candidates, and dynamic programming selects the best path among them. Experiments are performed on a MIDI song database divided into different music styles, and the proposed system is compared to the Melisma Music Analyzer software proposed by Temperley. Results show that our method has a comparable efficiency and provides not only the root of the chord, but also its mode (major or minor). The proposed system remains extensible: it can support more chord types once appropriate rules to handle them are specified.

1. CHORD REPRESENTATION

A chord can be defined in different ways. It can simply be defined as the perception of several notes sounding at the same time. However, this first definition is not entirely compatible with jazz chord notation, where each bar may be labeled with a chord. In this case a pianist, for example, is not expected to hold the specified chord as a whole note: the only restriction is to play notes or groups of notes belonging mostly to the specified chord (some ornaments that do not belong to the chord might be played). The difference between these two possible definitions is illustrated in Figure 1. Depending on the definition adopted, it is possible to see either 8 chords or only a single one in this excerpt.

In this paper, we choose the second definition, and we use chord to mean the common label of a sequence of notes or groups of notes that could sound at the same time (like the Bb6 chord label in Figure 1).
These consecutive notes or groups of notes within the same chord are called note segments of this chord. A note segment of a given chord may not contain all the notes of the chord, and may contain ornaments (notes which do not belong to the chord). For example, both (C,E,Bb) and (C,G) can be note segments of the C7 chord.

Figure 1. An excerpt of Bohemian Rhapsody by Queen. According to the first possible definition, there are 8 different chords in the presented bar (as many as there are groups of notes sounding at the same time). According to the second possible definition, there is only a single chord in this bar (characterized by the Bb6 chord label).

In this paper, the 3 parameters of a chord are:
• the root (the note upon which the chord is built),
• the type (the component intervals defining the chord relative to the root),
• the mode (major, minor or undefined).

2. DYNAMIC CHORD ANALYSIS

In this section, we first propose a new method for time segmentation using note segments. The graph of chord candidates is then created, before the dynamic process selects the best path in this graph.

2.1. Time segmentation

The first issue in the chord detection process is to find an appropriate time segmentation. Audio-based methods [?, ?, ?] use time frames for this purpose. A similar approach can be adopted in symbolic music, by using MIDI ticks (or milliseconds) as time units to set the length of an analysis window.

Figure 2. Above: a musical excerpt. Below: the same excerpt with a homorhythmic transformation. Note segments appear bordered in red.

We choose a different approach, and assume that a new chord can only occur with a new instance, i.e. when at least one new note starts being played, or stops being played. We thus chose to perform a homorhythmic transformation, as introduced in [?], in order to make all the notes sounding at the same time start and end at the same time as well. Therefore no overlap between notes occurs.
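As a minimal sketch of the homorhythmic transformation, the Python below cuts every note at each instant where some note starts or stops, so that the notes inside a segment all share the same start and end. The `(pitch, onset, offset)` tuple representation and the name `note_segments` are assumptions made for illustration; they are not taken from the paper or from [?].

```python
from typing import FrozenSet, List, Tuple

Note = Tuple[int, float, float]                  # (MIDI pitch, onset, offset), hypothetical layout
Segment = Tuple[float, float, FrozenSet[int]]    # (start, end, pitches sounding in between)


def note_segments(notes: List[Note]) -> List[Segment]:
    """Split notes at every onset/offset so that no two segments overlap."""
    # Every instant where a note starts or stops is a potential chord change.
    boundaries = sorted({t for _, onset, offset in notes for t in (onset, offset)})
    segments: List[Segment] = []
    for start, end in zip(boundaries, boundaries[1:]):
        # A note belongs to the segment if it sounds during the whole interval.
        sounding = frozenset(
            pitch for pitch, onset, offset in notes if onset < end and offset > start
        )
        if sounding:  # silent gaps produce no note segment
            segments.append((start, end, sounding))
    return segments


if __name__ == "__main__":
    # C held for 4 beats, E for the first 2 beats, G for the last 2 beats.
    example = [(60, 0.0, 4.0), (64, 0.0, 2.0), (67, 2.0, 4.0)]
    for segment in note_segments(example):
        print(segment)
```

In this example the held C is split in two, giving the segments (C,E) and (C,G); the pitches heard at each instant are unchanged.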
This time segmentation defines the different note segments, each of them being formed by notes starting and ending at the same time. Each of these segments potentially starts a new chord. It is important to note that this transformation does not affect the way music is played and heard. An illustration of this homorhythmic transformation can be found in Figure 2. Once this transformation is completed, the enumeration of chord candidates can take place for each note segment.

2.2. Graph of chord candidates

The note segment constitutes the observation of the dynamic process. A list of hypotheses is built from each observation, these hypotheses being the chord candidates. In the proposed method, a chord candidate is a pair composed of a compatible chord and a compatible key. Rule-based algorithms are used to determine which chords and keys are compatible with each note segment. The graph of chord candidates is then built: each chord candidate of a given segment is linked to all the candidates of the next segment, thus forming a directed acyclic graph.

2.2.1. Compatible Keys

The proposed method enumerates which keys are compatible with each note segment, by using a rule-based approach. Different rules may be used for that purpose. In this paper, we choose to define a key as compatible if each note of the note segment is a component pitch of this key, which means that each note of the segment must belong to the scale of the key (melodic scale for the minor mode). For example, if the note segment is (C,E), the compatible keys are CMaj, FMaj, GMaj, Amin, Dmin, and Emin, because both C and E belong to the scales of these keys.

2.2.2. Compatible Chords

As for compatible keys, several rules may be defined to determine which chords are compatible with a note segment.
We use the following rules, and define different conditions for a chord to be compatible with a note segment, depending on the chord type:
• Maj/min triad chord: such a chord is compatible if each note in the considered segment belongs to the chord. For example, the note segment (C,E) has two compatible triad chords: Amin and CMaj.
• Maj/min 7th chord: such a chord is compatible if each note in the considered segment belongs to the chord and if the root and the Maj/min 7th note are present. This rule prevents note segments like (E,G,B) or (E,G) from being compatible with CMaj7, for example.
• Maj/min 9th chord: such a chord is compatible if each note in the considered segment belongs to the chord and if the root, the 5th note and the 9th note are present.
• Maj/min 11th chord: such a chord is compatible if each note in the considered segment belongs to the chord and if the root, the 5th note and the 11th note are present.
• Other chords may also be compatible, like the min7b5 chord. For these chords to be compatible, every chord note is required (for example, a min7b5 chord would be compatible if the considered segment contains the root, the minor 3rd, the flat 5th and the minor 7th).

The proposed system can be modified in order to accept more chord types in the future. New rules must then be specified for any new chord type to be compatible with a given note segment.

2.2.3. Chord Candidates Enumerated

The chord candidates finally enumerated are all the possible combinations of compatible keys and compatible chords. If n chords and m keys are compatible, n × m pairs are enumerated. For example, with CMaj and Amin as compatible chords and CMaj and GMaj as compatible keys, the chord candidates enumerated would be (CMaj,CMaj), (CMaj,GMaj), (Amin,CMaj) and (Amin,GMaj).

If no compatible chord can be enumerated for a given note segment, the hypotheses we choose to set are the chord candidates of the previous note segment combined with the compatible keys.
This is to keep taking key detection into account, even if no chord is compatible. If no compatible key can be built, the hypotheses we choose to set are the same ones as for the previous note segment. Both these choices may be justified by the high probability of having two consecutive note segments being part of the same chord.

Finally, the chord output only mentions a root and a mode (major or minor). In other words, even if the type of a chord is supported by our system, only the corresponding mode is taken into account. For example, if a chord is detected as a C7, it is handled by our system as CMaj. We thus choose to focus on the root and the mode for now, the evaluation of chord types being left for future work.

2.2.4. Chord Transition Cost

Once the chord candidates are enumerated for two consecutive note segments, an edge is built from each of the first segment's chord candidates to each of the second segment's. This edge is weighted by a transition cost between the two chord candidates. This transition cost must take into account both the different compatible chords and the different compatible keys. We thus choose to use Lerdahl's distance [?] as transition cost. This distance is based on the notion of basic space. Lerdahl defines the basic space of a given chord in a given key as the geometrical superposition of:
(a) the chromatic pitches of the given key (chromatic level),
(b) the diatonic pitches of the given key (diatonic level),
(c) the triad pitches of the given chord (triadic level),
(d) the root and dominant of the given chord (fifths level),
(e) the root of the given chord (root level).

Figure 3. The basic space of the CMaj chord in the CMaj key. Levels (a) to (e) are respectively the chromatic, diatonic, triadic, fifths and root levels.

Figure 3 shows the basic space of the CMaj chord in the CMaj key.
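The five levels can be written down directly as sets of pitch classes (0 = C, 1 = C#, ..., 11 = B). The sketch below is only an illustration of the definition above: the function name `basic_space`, the string labels, and the use of the natural minor scale for minor keys are assumptions, not details given in the paper.

```python
from typing import Dict, FrozenSet

NOTE_TO_PC = {"C": 0, "C#": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)
# Assumption: natural minor is used here; the paper does not say which
# minor scale feeds the diatonic level of the basic space.
MINOR_SCALE = (0, 2, 3, 5, 7, 8, 10)


def basic_space(chord_root: str, chord_mode: str,
                key_tonic: str, key_mode: str) -> Dict[str, FrozenSet[int]]:
    """Return levels (a) to (e) of the basic space as sets of pitch classes."""
    root = NOTE_TO_PC[chord_root]
    tonic = NOTE_TO_PC[key_tonic]
    scale = MAJOR_SCALE if key_mode == "maj" else MINOR_SCALE
    third = 4 if chord_mode == "maj" else 3   # major or minor third above the root
    fifth = (root + 7) % 12
    return {
        "a": frozenset(range(12)),                                # chromatic level
        "b": frozenset((tonic + step) % 12 for step in scale),    # diatonic level
        "c": frozenset({root, (root + third) % 12, fifth}),       # triadic level
        "d": frozenset({root, fifth}),                            # fifths level
        "e": frozenset({root}),                                   # root level
    }


if __name__ == "__main__":
    # The case of Figure 3: the CMaj chord in the CMaj key.
    for level, pitch_classes in basic_space("C", "maj", "C", "maj").items():
        print(level, sorted(pitch_classes))
```

For the CMaj chord in the CMaj key this yields {C} at the root level, {C,G} at the fifths level, {C,E,G} at the triadic level, the seven notes of the C major scale at the diatonic level, and all twelve pitch classes at the chromatic level.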
If (Cx,Kx) represents the chord Cx in the key Kx, Lerdahl defines the transition cost from x = (Cx,Kx) to y = (Cy,Ky) as follows:

δ(x → y) = i + j + k

where i is the distance between Kx and Ky in the circle of fifths (Figure 5), j is the distance between Cx and Cy in the circle of fifths and k is the number of non-common pitch classes in the basic space of y compared to those in the basic space of x. The distance thus provides an integer cost from 0 to 13, and is well suited as a transition cost in the proposed method, since both compatible chords and keys are involved in the cost computation.

Figure 4. δ((CMaj,CMaj) → (GMaj,GMaj)) = i + j + k = 1 + 1 + 5 = 7. The underlined pitches are the non-common pitches.

Figure 5. The circle of fifths.

A calculation of a chord transition is illustrated in Figure 4, from x = (CMaj,CMaj) to y = (GMaj,GMaj). Here, i = j = 1 because 1 step is needed to go from CMaj to GMaj in the circle of fifths. k = 5 is the number of non-common pitches belonging to the basic space of y compared to those in the basic space of x (underlined in the figure). The distance is therefore 1 + 1 + 5 = 7.

2.3. Dynamic process

Once the graph between all the chord candidates is formed, the best path has to be found. This task is achieved by dynamic programming [?]. In the graph, from left to right, only one edge to each chord candidate is preserved. Several ways to select this edge can be considered. After experiments, we choose to preserve the edge minimizing the cost
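The transition cost and the best-path selection can be sketched together in Python. Several choices below are assumptions rather than details from the paper: the label format (`"Gmaj"`, `"Amin"`), the helper names, the natural minor scale for minor keys, measuring circle-of-fifths steps between roots or tonics regardless of mode, and keeping for each candidate the incoming edge that minimizes the cumulative path cost (the source text breaks off before stating the exact criterion).

```python
from typing import List, Sequence, Set, Tuple

Candidate = Tuple[str, str]  # (chord, key), e.g. ("Gmaj", "Cmaj"); hypothetical format

PC = {"C": 0, "C#": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
      "F#": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}
MAJOR, MINOR = (0, 2, 4, 5, 7, 9, 11), (0, 2, 3, 5, 7, 8, 10)  # minor scale: assumption


def parse(label: str) -> Tuple[int, str]:
    """Split a label such as 'Gmaj' or 'Bbmin' into (pitch class, mode)."""
    return PC[label[:-3]], label[-3:]


def levels(cand: Candidate) -> List[Set[int]]:
    """Diatonic, triadic, fifths and root levels (the chromatic level never changes)."""
    (root, chord_mode), (tonic, key_mode) = parse(cand[0]), parse(cand[1])
    third = 4 if chord_mode == "maj" else 3
    scale = MAJOR if key_mode == "maj" else MINOR
    return [{(tonic + s) % 12 for s in scale},
            {root, (root + third) % 12, (root + 7) % 12},
            {root, (root + 7) % 12},
            {root}]


def fifths_steps(a: int, b: int) -> int:
    """Shortest number of steps between two pitch classes on the circle of fifths."""
    steps = ((b - a) * 7) % 12  # multiplying by 7 maps semitones to fifths positions
    return min(steps, 12 - steps)


def delta(x: Candidate, y: Candidate) -> int:
    """Transition cost delta(x -> y) = i + j + k."""
    i = fifths_steps(parse(x[1])[0], parse(y[1])[0])  # key distance
    j = fifths_steps(parse(x[0])[0], parse(y[0])[0])  # chord distance
    k = sum(len(new - old) for old, new in zip(levels(x), levels(y)))
    return i + j + k


def best_path(layers: Sequence[Sequence[Candidate]]) -> Tuple[int, List[Candidate]]:
    """Keep the cheapest incoming edge of every candidate, then backtrack."""
    cost = [0] * len(layers[0])
    back: List[List[int]] = []
    for prev, cur in zip(layers, layers[1:]):
        new_cost, pointers = [], []
        for y in cur:
            best = min(range(len(prev)), key=lambda p: cost[p] + delta(prev[p], y))
            pointers.append(best)
            new_cost.append(cost[best] + delta(prev[best], y))
        cost = new_cost
        back.append(pointers)
    idx = min(range(len(cost)), key=cost.__getitem__)
    total, path = cost[idx], [layers[-1][idx]]
    for layer, pointers in zip(reversed(layers[:-1]), reversed(back)):
        idx = pointers[idx]
        path.append(layer[idx])
    return total, path[::-1]


if __name__ == "__main__":
    print(delta(("Cmaj", "Cmaj"), ("Gmaj", "Gmaj")))  # the example of Figure 4
    graph = [[("Cmaj", "Cmaj"), ("Amin", "Cmaj")],
             [("Gmaj", "Cmaj"), ("Gmaj", "Gmaj")],
             [("Cmaj", "Cmaj")]]
    print(best_path(graph))
```

Under these assumptions, `delta` reproduces the value 7 of the worked example, and on the small three-segment graph the selected path stays in the CMaj key (CMaj, GMaj, CMaj) with a total cost of 10.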